@intbf intbf commented Sep 19, 2025

Description

When calling subgraph.ToProto, don't save initializers, so that we don't bloat the model proto. This is especially important for models with external initializers in memory that together contain more than 2 GB of data.

Once we have the list of those external initializers, we can update their metadata inside the proto so that they use the special marker: */_ORT_MEM_ADDR_/*. (Support for this location was recently fixed inside OV with openvinotoolkit/openvino#31632.)
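
For readers following along, the metadata update described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: `InitializerMeta`, `ExternalDataEntry`, `SetExternalDataField` and `MarkAsInMemory` are hypothetical stand-ins for the ONNX TensorProto `external_data` entries. It also assumes that, for the in-memory marker, `offset` carries the buffer address and `length` the byte size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for one key/value entry of TensorProto.external_data.
struct ExternalDataEntry {
  std::string key;
  std::string value;
};

// Hypothetical stand-in for an initializer that carries only metadata, no raw bytes.
struct InitializerMeta {
  std::string name;
  std::vector<ExternalDataEntry> external_data;
};

// Marker quoted in the PR description.
constexpr const char* kMemAddrTag = "*/_ORT_MEM_ADDR_/*";

// Overwrite the entry with the given key, or append it if it is missing.
inline void SetExternalDataField(InitializerMeta& init, const std::string& key, std::string value) {
  for (auto& entry : init.external_data) {
    if (entry.key == key) {
      entry.value = std::move(value);
      return;
    }
  }
  init.external_data.push_back({key, std::move(value)});
}

// Point the initializer at an in-memory buffer instead of embedding its bytes.
// Assumption: offset = buffer address, length = size in bytes.
inline void MarkAsInMemory(InitializerMeta& init, const void* data, std::size_t byte_length) {
  assert(data != nullptr);
  const auto address = reinterpret_cast<std::uintptr_t>(data);
  SetExternalDataField(init, "location", kMemAddrTag);
  SetExternalDataField(init, "offset", std::to_string(address));
  SetExternalDataField(init, "length", std::to_string(byte_length));
}
```

With this shape the serialized proto only grows by three short strings per initializer, regardless of how large the weight buffer is.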

Motivation and Context

CVS-173057, CVS-172710

Todo/questions

  • Should we have some condition before using this approach? Maybe only when the model has those external weights?
  • The code in GetModelProtoFromFusedNode currently supports only one case, the regular one, but similar logic has to be applied to the other cases, such as QDQ stripping, scale fix, etc.
  • Add unit tests

@Copilot Copilot AI left a comment

Pull Request Overview

This PR modifies the model proto generation to avoid embedding external initializers directly into the proto, preventing potential issues when models exceed the 2GB protobuf limit. The changes introduce a mechanism to handle initializers with external data by updating their metadata to use a special memory address marker instead of including the actual data.

Key changes:

  • Add logic to exclude initializer data from proto generation when load_user_initializer_ is enabled
  • Implement metadata updating for external initializers using the */_ORT_MEM_ADDR_/* marker
  • Add comprehensive logging for debugging initializer processing

@preetha-intel preetha-intel self-requested a review September 24, 2025 04:41
… restores the metadata so OV can read them back

Signed-off-by: bfilipek <[email protected]>
@intbf intbf force-pushed the dont_write_ext_initializers branch from fd29454 to 2b34773 Compare September 26, 2025 13:22
…re are external initializers in memory (more than one)

Signed-off-by: bfilipek <[email protected]>
@intbf intbf requested a review from Copilot September 26, 2025 19:23
@Copilot Copilot AI left a comment

Pull Request Overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 3 comments.


}

static void ReadExternalDataFields(const ONNX_NAMESPACE::TensorProto* src_init, std::string& location, size_t& offset, size_t& length) {
// Remove constness as we need to use mutable_external_data() to get the entries to read.
Copilot AI Sep 26, 2025

There's a trailing space after the period in the comment on line 497.

Suggested change
// Remove constness as we need to use mutable_external_data() to get the entries to read.
// Remove constness as we need to use mutable_external_data() to get the entries to read.

Copilot uses AI. Check for mistakes.

}
else {
// Debug info for file-based initializers
LOGS(logger, VERBOSE)<< "File-based initializer: "
Copilot AI Sep 26, 2025

Missing space between LOGS(logger, VERBOSE) and << operator.

Suggested change
LOGS(logger, VERBOSE)<< "File-based initializer: "
LOGS(logger, VERBOSE) << "File-based initializer: "

// and bloat the serialized string. We can avoid that by not including the data in the proto
// but then we have to update those initializers and set the external_data fields to mem_addr tag...
// 1 is arbitrary number, but if we have more than 1 external initializer, then the savings are worth the effort
const bool include_initializer_data_in_proto = !(session_context_.has_external_weights == true && extInitializerCount > 1);
Author

This is a basic check to limit the use case to those external initializers, but I'm open to suggestions. Maybe we even need a flag that is passed down from SessionOptions. If we want to enable this logic for all cases, we might need some more testing.
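
One possible shape for such a flag, purely as a sketch: the `ProtoExportPolicy` struct and its `min_external_initializers` field are hypothetical and do not correspond to any existing SessionOptions key. With the default threshold of 2, the predicate matches the `extInitializerCount > 1` check in the diff.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical policy object; in practice the values might come from session/provider options.
struct ProtoExportPolicy {
  bool has_external_weights = false;
  // Strip initializer data only when at least this many external initializers exist.
  std::size_t min_external_initializers = 2;
};

// Returns true when initializer data should stay embedded in the model proto.
inline bool IncludeInitializerDataInProto(const ProtoExportPolicy& policy,
                                          std::size_t ext_initializer_count) {
  // A threshold of 0 would request stripping even when nothing is external.
  assert(policy.min_external_initializers > 0);
  const bool strip = policy.has_external_weights &&
                     ext_initializer_count >= policy.min_external_initializers;
  return !strip;
}
```

Keeping the threshold configurable would let the stripping path be enabled more broadly once it has more test coverage, without changing the default behavior.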
